Frame-Level Vocal Effort Likelihood Space Modeling for Improved Whisper-Island Detection
Abstract
In this study, a frame-based vocal effort likelihood space modeling framework is proposed for improved whisper-island detection within normally phonated audio streams. The method first trains a traditional Gaussian mixture model for whisper and neutral speech, which is then employed to extract a newly proposed discriminative feature set, termed Vocal Effort Likelihood (VEL), for whisper-island detection. The VEL feature set is integrated within a BIC/T-BIC segmentation scheme for vocal effort change point (VECP) detection. With the dimension-reduced 2-D VEL feature set, the proposed framework has a lower computational cost than the prior method [1]. Experimental results for whisper-island detection on the UT-VocalEffort II corpus are presented and compared with the previous algorithm introduced in [1]. The proposed algorithm is shown to improve VECP detection performance, achieving the lowest Multi-Error Score (MES) of 6.33. Furthermore, very accurate whisper-island detection is obtained with the proposed algorithm, which is useful for sustaining performance in speech systems (ASR, speaker ID, etc.) that may encounter whispered speech. Finally, the proposed algorithm achieves a 100% detection rate, which represents the best whisper-island detection performance with the lowest computational cost available in the literature to date.
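The pipeline summarized in the abstract (class-conditional models, a 2-D per-frame likelihood feature, BIC-based change-point detection) can be illustrated with a minimal sketch. It is not the authors' implementation: a single diagonal Gaussian per class stands in for each Gaussian mixture model, the frames are synthetic 13-dimensional vectors rather than real acoustic features, only plain BIC (not T-BIC) is shown, and all function names and parameters (`vel_features`, `delta_bic`, `margin`, `penalty_weight`) are hypothetical.

```python
import numpy as np

def fit_diag_gaussian(frames):
    """Fit one diagonal Gaussian (assumed 1-component stand-in for a GMM)."""
    mean = frames.mean(axis=0)
    var = frames.var(axis=0) + 1e-6  # small variance floor for stability
    return mean, var

def frame_log_likelihood(frames, model):
    """Per-frame log-likelihood under a diagonal Gaussian."""
    mean, var = model
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (frames - mean) ** 2 / var, axis=1)

def vel_features(frames, whisper_model, neutral_model):
    """Map each frame to a 2-D vector of (whisper, neutral) log-likelihoods."""
    return np.column_stack([
        frame_log_likelihood(frames, whisper_model),
        frame_log_likelihood(frames, neutral_model),
    ])

def delta_bic(features, t, penalty_weight=1.0):
    """Delta-BIC for splitting `features` at index t (positive favors a split)."""
    n, d = features.shape

    def logdet(x):
        cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    penalty = 0.5 * penalty_weight * (d + 0.5 * d * (d + 1)) * np.log(n)
    gain = 0.5 * (n * logdet(features)
                  - t * logdet(features[:t])
                  - (n - t) * logdet(features[t:]))
    return gain - penalty

def find_change_point(features, margin=10):
    """Return the best split index, or None if no split has positive delta-BIC."""
    candidates = range(margin, len(features) - margin)
    scores = [delta_bic(features, t) for t in candidates]
    best = int(np.argmax(scores))
    return candidates[best] if scores[best] > 0 else None

# Synthetic demo: 200 "neutral" frames followed by 150 "whisper" frames.
rng = np.random.default_rng(0)
neutral_model = fit_diag_gaussian(rng.normal(0.0, 1.0, size=(500, 13)))
whisper_model = fit_diag_gaussian(rng.normal(2.0, 1.0, size=(500, 13)))
stream = np.vstack([rng.normal(0.0, 1.0, size=(200, 13)),
                    rng.normal(2.0, 1.0, size=(150, 13))])

vel = vel_features(stream, whisper_model, neutral_model)
change_point = find_change_point(vel)
print("VEL shape:", vel.shape, "detected change point:", change_point)
```

Because the segmentation runs on a 2-D feature rather than the full acoustic feature vector, each covariance determinant in `delta_bic` involves only a 2x2 matrix, which is one way to read the abstract's claim of reduced computational cost.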
Related resources
Advancements in whisper-island detection within normally phonated audio streams
In this study, several improvements are proposed for whisper-island detection within normally phonated audio streams. Based on our previous study, an improved feature, which is more sensitive to vocal effort change points between whisper and neutral speech, is developed and utilized in vocal effort change point (VECP) detection and vocal effort classification. Evaluation is based on the...
An entropy based feature for whisper-island detection within audio streams
Non-neutral speech, especially whispered speech, has strong negative impact on speech system performance. It is therefore necessary to detect whisper-islands embedded within neutral speech prior to subsequent processing steps. Detecting whisper-islands in speech audio streams can contribute to improved modeling, speech analysis, and understanding. Speech technology can also benefit by allowing ...
Analysis and classification of speech mode: whispered through shouted
Variation in vocal effort represents one of the most challenging problems in maintaining speech system performance for coding, speech and speaker recognition. Changes in vocal effort (or mode) result in a fundamental change in speech production which is not simply a change in volume. This is the first study to collectively consider the five speech modes: whispered, soft, neutral, loud and shout...
Interactive Analysis of Space Frame Raft Soil System
This study presents a new approach for physical and material modeling of a space frame-raft-soil system. The physical modeling consists of a modified Timoshenko beam bending element with six degrees of freedom per node to model the beams and columns of the superstructure, a modified Mindlin's plate bending element with five degrees of freedom per node to represent the structural slabs and raft, ...